Left-skewed Lévy distribution (levy_l)#
scipy.stats.levy_l is a continuous distribution with support \((-\infty, \mathrm{loc})\) (standard: \((-\infty, 0)\)). It is extremely heavy-tailed: the mean and variance do not exist as finite numbers.
A useful generative story is a simple transformation of a standard normal: if \(Z\sim\mathcal{N}(0,1)\) then

$$X = -\frac{1}{Z^2}$$

has the standard levy_l distribution.
Learning goals#
- Write down the PDF/CDF and connect the CDF to the normal CDF / error function.
- Understand the relationship to `levy`, `levy_stable`, and the inverse-gamma family.
- See why moments diverge and which summaries remain meaningful (quantiles, median).
- Implement NumPy-only sampling and validate against SciPy.
- Fit parameters and use the distribution in simple inference workflows.
import numpy as np
import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
from plotly.subplots import make_subplots
import scipy
from scipy import optimize, special
from scipy.stats import levy_l as levy_l_dist
from scipy.stats import norm
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=6, suppress=True)
rng = np.random.default_rng(42)
# Record versions for reproducibility (useful when numerical details matter).
VERSIONS = {"numpy": np.__version__, "scipy": scipy.__version__, "plotly": plotly.__version__}
VERSIONS
{'numpy': '1.26.2', 'scipy': '1.15.0', 'plotly': '6.5.2'}
1) Title & Classification#
- Name: `levy_l` (left-skewed Lévy distribution; SciPy: `scipy.stats.levy_l`)
- Type: continuous
- Support:
  - Standard form: \(x<0\)
  - With `loc`, `scale`: \(x < \mathrm{loc}\)
- Parameter space (SciPy location/scale form):
  - \(\mathrm{loc}\in\mathbb{R}\)
  - \(\mathrm{scale}>0\)
There are no additional shape parameters; levy_l is a 2-parameter location/scale family.
2) Intuition & Motivation#
What it models#
levy_l is a one-sided (fully left-skewed) stable distribution with stability index \(\alpha=1/2\).
It places most of its mass near the upper end of its support (near loc), but has a very heavy left tail. In practice this means:
- You typically see many values close-ish to `loc`.
- Rarely, you can see enormous negative outliers.
- Classical summaries that rely on finite moments (mean/variance) are unreliable.
Typical real-world use cases#
- First-passage times (Lévy): the (right-skewed) Lévy distribution arises as the distribution of the first time a driftless Brownian motion hits a fixed positive level. `levy_l` is the mirror image, useful when modeling an upper-bounded quantity with a heavy lower tail.
- One-sided heavy-tailed noise: as a component in mixture models for data with rare but extreme negative shocks.
- Stable-process building block: `levy_l` corresponds to the fully left-skewed \(\alpha=1/2\) stable law (see `scipy.stats.levy_stable`).
Relations to other distributions#
- Mirror of `levy`: if \(Y\sim\texttt{levy}(0,1)\) then \(-Y\sim\texttt{levy\_l}(0,1)\). More generally,

  $$X\sim\texttt{levy\_l}(\mathrm{loc},\mathrm{scale}) \iff \mathrm{loc}-X\sim\texttt{levy}(0,\mathrm{scale}).$$

- Inverse-gamma: if \(X\sim\texttt{levy\_l}(\mathrm{loc},\mathrm{scale})\) then \(\mathrm{loc}-X\) follows an inverse-gamma distribution with shape \(\alpha=\tfrac12\) and scale parameter \(\beta=\tfrac{\mathrm{scale}}{2}\).
- Stable law: `levy_l` is the same as `levy_stable` with parameters \((\alpha,\beta)=(1/2,-1)\) (up to SciPy's parameterization conventions).
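Both relations are easy to check numerically (a quick sanity cell, not strictly necessary for what follows):

```python
import numpy as np
from scipy.stats import levy, levy_l, invgamma

loc, scale = -1.2, 2.5
x = np.linspace(loc - 30.0, loc - 0.05, 400)  # points inside the support x < loc

# Mirror relation: levy_l(loc, scale) at x equals levy(0, scale) at loc - x.
mirror_err = np.max(np.abs(levy_l.pdf(x, loc=loc, scale=scale)
                           - levy.pdf(loc - x, loc=0, scale=scale)))

# Inverse-gamma relation: loc - X ~ InvGamma(shape=1/2, scale=scale/2).
invgamma_err = np.max(np.abs(levy_l.pdf(x, loc=loc, scale=scale)
                             - invgamma.pdf(loc - x, a=0.5, scale=scale / 2)))

print(mirror_err, invgamma_err)  # both agree to machine precision
```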
3) Formal Definition#
PDF#
In SciPy’s location/scale form, for \(x<\mathrm{loc}\),

$$f(x) = \sqrt{\frac{\mathrm{scale}}{2\pi}}\,(\mathrm{loc}-x)^{-3/2}\exp\!\left(-\frac{\mathrm{scale}}{2(\mathrm{loc}-x)}\right),$$

and \(f(x)=0\) for \(x\ge\mathrm{loc}\).
The standard form (loc=0, scale=1) simplifies to

$$f(x) = \frac{1}{\sqrt{2\pi}}\,|x|^{-3/2}\exp\!\left(-\frac{1}{2|x|}\right),\qquad x<0.$$
CDF#
For \(x<\mathrm{loc}\),

$$F(x) = \operatorname{erf}\!\left(\sqrt{\frac{\mathrm{scale}}{2(\mathrm{loc}-x)}}\right) = 2\,\Phi\!\left(\sqrt{\frac{\mathrm{scale}}{\mathrm{loc}-x}}\right) - 1,$$

and \(F(x)=1\) for \(x\ge\mathrm{loc}\). Here \(\Phi\) is the standard normal CDF and \(\operatorname{erf}\) is the error function.
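Since \(\operatorname{erf}(u) = 2\Phi(u\sqrt{2}) - 1\), the erf and normal-CDF forms of the CDF agree; a quick check against SciPy:

```python
import numpy as np
from scipy import special
from scipy.stats import levy_l, norm

loc, scale = 0.5, 1.5
x = np.linspace(loc - 50.0, loc - 0.01, 500)
y = loc - x

F_erf = special.erf(np.sqrt(scale / (2 * y)))  # erf form of the CDF
F_phi = 2 * norm.cdf(np.sqrt(scale / y)) - 1   # equivalent normal-CDF form

err_forms = np.max(np.abs(F_erf - F_phi))
err_scipy = np.max(np.abs(F_erf - levy_l.cdf(x, loc=loc, scale=scale)))
print(err_forms, err_scipy)
```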
Quantile function (PPF)#
For \(q\in(0,1)\), let \(z = \Phi^{-1}\!\left(\tfrac{q+1}{2}\right)\). Then

$$F^{-1}(q) = \mathrm{loc} - \frac{\mathrm{scale}}{z^2}.$$
def levy_l_logpdf(x: np.ndarray, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
"""Log-PDF of levy_l(loc, scale) evaluated at x.
SciPy's support is x < loc with scale > 0.
"""
x = np.asarray(x, dtype=float)
if not np.isfinite(loc):
raise ValueError("loc must be finite")
if not np.isfinite(scale) or scale <= 0:
raise ValueError("scale must be positive and finite")
y = loc - x
out = np.full_like(x, fill_value=-np.inf, dtype=float)
mask = y > 0
yy = y[mask]
out[mask] = (
0.5 * np.log(scale)
- 0.5 * np.log(2 * np.pi)
- 1.5 * np.log(yy)
- scale / (2 * yy)
)
return out
def levy_l_pdf(x: np.ndarray, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
"""PDF of levy_l(loc, scale) evaluated at x."""
return np.exp(levy_l_logpdf(x, loc=loc, scale=scale))
def levy_l_cdf(x: np.ndarray, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
"""CDF of levy_l(loc, scale) evaluated at x."""
x = np.asarray(x, dtype=float)
if not np.isfinite(loc):
raise ValueError("loc must be finite")
if not np.isfinite(scale) or scale <= 0:
raise ValueError("scale must be positive and finite")
y = loc - x
out = np.zeros_like(x, dtype=float)
mask = y > 0
yy = y[mask]
out[mask] = special.erf(np.sqrt(scale / (2 * yy)))
out[~mask] = 1.0
return out
def levy_l_ppf(q: np.ndarray, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
"""Quantile function (inverse CDF) for levy_l(loc, scale)."""
q = np.asarray(q, dtype=float)
if not np.isfinite(loc):
raise ValueError("loc must be finite")
if not np.isfinite(scale) or scale <= 0:
raise ValueError("scale must be positive and finite")
if np.any((q <= 0) | (q >= 1)):
raise ValueError("q must lie strictly in (0, 1)")
z = norm.ppf((q + 1.0) / 2.0)
return loc - scale / (z * z)
# Sanity check: our formulas match SciPy.
loc, scale = -1.2, 2.5
# Use a truncated range because levy_l is extremely heavy-tailed.
x = np.linspace(
levy_l_dist.ppf(0.45, loc=loc, scale=scale),
levy_l_dist.ppf(0.999, loc=loc, scale=scale),
25,
)
pdf_max_err = np.max(np.abs(levy_l_pdf(x, loc=loc, scale=scale) - levy_l_dist.pdf(x, loc=loc, scale=scale)))
cdf_max_err = np.max(np.abs(levy_l_cdf(x, loc=loc, scale=scale) - levy_l_dist.cdf(x, loc=loc, scale=scale)))
q = np.linspace(0.05, 0.95, 7)
ppf_max_err = np.max(np.abs(levy_l_ppf(q, loc=loc, scale=scale) - levy_l_dist.ppf(q, loc=loc, scale=scale)))
roundtrip_max_err = np.max(np.abs(levy_l_cdf(levy_l_ppf(q, loc=loc, scale=scale), loc=loc, scale=scale) - q))
pdf_max_err, cdf_max_err, ppf_max_err, roundtrip_max_err
(8.326672684688674e-17,
1.6653345369377348e-16,
7.105427357601002e-15,
1.1102230246251565e-16)
4) Moments & Properties#
Mean, variance, skewness, kurtosis#
levy_l is so heavy-tailed that the usual raw moments do not exist as finite numbers.
- Mean: diverges (for the standard form it diverges to \(-\infty\))
- Variance: infinite
- Skewness / kurtosis: undefined

What does remain well-behaved:

- Quantiles (including the median)
- Tail probabilities (via CDF/SF)
A useful reparameterization is \(Y = \mathrm{loc} - X > 0\). Then \(Y\) has the (right) Lévy distribution and can be viewed as an inverse-gamma random variable with shape \(\alpha=1/2\) and scale \(\beta=\mathrm{scale}/2\):

$$f_Y(y) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,y^{-\alpha-1}e^{-\beta/y} = \sqrt{\frac{\mathrm{scale}}{2\pi}}\,y^{-3/2}e^{-\mathrm{scale}/(2y)},\qquad y>0.$$

From inverse-gamma moment conditions, \(\mathbb{E}[Y^p]\) exists iff \(p<\alpha=1/2\).
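A short Monte Carlo illustration: for \(p<1/2\) the sample average of \(Y^p\) settles near the exact inverse-gamma moment \(\mathbb{E}[Y^p]=(\mathrm{scale}/2)^p\,\Gamma(\tfrac12-p)/\Gamma(\tfrac12)\), while for \(p\ge 1/2\) it is dominated by a few extreme samples:

```python
import numpy as np
from scipy import special
from scipy.stats import levy_l

rng = np.random.default_rng(0)
scale = 1.0
y = -levy_l.rvs(loc=0.0, scale=scale, size=1_000_000, random_state=rng)  # Y = -X > 0

# Exact fractional moment from the inverse-gamma identity (valid for p < 1/2).
p = 0.1
exact = (scale / 2) ** p * special.gamma(0.5 - p) / special.gamma(0.5)
mc = np.mean(y ** p)
print(exact, mc)  # Monte Carlo agrees for p < 1/2 ...

# ... but for p >= 1/2 the moment diverges: this estimate is dominated by the
# largest few samples and jumps around wildly under different seeds.
print(np.mean(y ** 0.75))
```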
MGF / characteristic function#
Because the mean is infinite, the moment generating function is not finite in any neighborhood of 0. However, since the support is bounded above, the one-sided MGF exists for \(t>0\):

$$M(t) = \mathbb{E}\!\left[e^{tX}\right] = \exp\!\left(t\,\mathrm{loc} - \sqrt{2\,\mathrm{scale}\,t}\right),\qquad t\ge 0.$$

The characteristic function exists for all real \(t\):

$$\varphi(t) = \exp\!\left(i t\,\mathrm{loc} - \sqrt{2 i\,\mathrm{scale}\,t}\right),$$

where \(\sqrt{\cdot}\) is the principal complex square root.
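Because \(e^{tX}\le e^{t\,\mathrm{loc}}\) is bounded on the support, a plain Monte Carlo average converges quickly and can be compared with the closed form \(M(t)=\exp(t\,\mathrm{loc}-\sqrt{2\,\mathrm{scale}\,t})\):

```python
import numpy as np
from scipy.stats import levy_l

rng = np.random.default_rng(1)
loc, scale = 0.0, 1.0
x = levy_l.rvs(loc=loc, scale=scale, size=1_000_000, random_state=rng)

t = 0.7
mgf_closed = np.exp(t * loc - np.sqrt(2 * scale * t))
mgf_mc = np.mean(np.exp(t * x))  # e^{tX} <= 1 here, so the average is well-behaved
print(mgf_closed, mgf_mc)
```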
Entropy#
The differential entropy is finite and has a closed form via the inverse-gamma identity above:

$$h = \frac{1}{2} + \frac{3\gamma}{2} + \ln\!\left(4\sqrt{\pi}\right) + \ln(\mathrm{scale}),$$

with \(\gamma\approx 0.57721\) the Euler–Mascheroni constant.
loc, scale = 0.0, 1.0
mean, var, skew, kurt = levy_l_dist.stats(loc=loc, scale=scale, moments="mvsk")
entropy_scipy = levy_l_dist.entropy(loc=loc, scale=scale)
entropy_closed = (0.5 + np.log(4 * np.sqrt(np.pi)) + 1.5 * np.euler_gamma) + np.log(scale)
mean, var, skew, kurt, entropy_scipy, entropy_closed
(inf, inf, nan, nan, 3.3244828015117402, 3.32448280139689)
5) Parameter Interpretation#
- `loc` is an upper endpoint: samples always satisfy \(X<\mathrm{loc}\).
- `scale` controls how far below `loc` the distribution typically lies and how heavy the tail is. Larger `scale` pushes mass further left and makes extreme negative values more likely.
Because levy_l is a location/scale family, changing loc shifts the distribution; changing scale stretches it.
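Concretely, the family satisfies \(f(x;\mathrm{loc},\mathrm{scale}) = f\big((x-\mathrm{loc})/\mathrm{scale};\,0,1\big)/\mathrm{scale}\), which is easy to verify:

```python
import numpy as np
from scipy.stats import levy_l

loc, scale = 2.0, 3.0
x = np.linspace(loc - 40.0, loc - 0.1, 300)

# Location/scale identity: f(x; loc, scale) = f((x - loc)/scale; 0, 1) / scale
lhs = levy_l.pdf(x, loc=loc, scale=scale)
rhs = levy_l.pdf((x - loc) / scale) / scale
err = np.max(np.abs(lhs - rhs))
print(err)
```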
# PDF for varying scale (truncate the far tail so the main shape is visible).
loc = 0.0
scale_values = [0.5, 1.0, 2.0, 5.0]
q_lo, q_hi = 0.4, 0.9995
x_min = levy_l_dist.ppf(q_lo, loc=loc, scale=max(scale_values))
x_max = levy_l_dist.ppf(q_hi, loc=loc, scale=min(scale_values))
x = np.linspace(x_min, x_max, 900)
fig = go.Figure()
for s in scale_values:
fig.add_trace(go.Scatter(x=x, y=levy_l_dist.pdf(x, loc=loc, scale=s), name=f"scale={s}"))
fig.update_layout(
title="levy_l PDF (varying scale; tail truncated)",
xaxis_title="x",
yaxis_title="pdf",
)
fig.show()
# CDF for varying loc (scale fixed).
scale = 1.0
loc_values = [-2.0, 0.0, 2.0]
q_lo, q_hi = 0.4, 0.9995
x_min = min(levy_l_dist.ppf(q_lo, loc=mu, scale=scale) for mu in loc_values)
x_max = max(levy_l_dist.ppf(q_hi, loc=mu, scale=scale) for mu in loc_values)
x = np.linspace(x_min, x_max, 900)
fig = go.Figure()
for mu in loc_values:
fig.add_trace(go.Scatter(x=x, y=levy_l_dist.cdf(x, loc=mu, scale=scale), name=f"loc={mu}"))
fig.update_layout(
title="levy_l CDF (varying loc; scale fixed)",
xaxis_title="x",
yaxis_title="cdf",
)
fig.show()
6) Derivations#
6.1 From a standard normal to levy_l#
Let \(Z\sim\mathcal{N}(0,1)\) and define \(Y=1/Z^2\) (so \(Y>0\)). For \(y>0\),

$$F_Y(y) = \mathbb{P}\!\left(Z^2 \ge 1/y\right) = 2\left(1 - \Phi\!\left(1/\sqrt{y}\right)\right).$$

Differentiating gives

$$f_Y(y) = \frac{1}{\sqrt{2\pi}}\,y^{-3/2}e^{-1/(2y)},$$

which is the (right-skewed) Lévy distribution. Now set \(X = -Y\), so \(X<0\) and \(X\) has the standard levy_l PDF.
Finally, apply location/scale:

$$X = \mathrm{loc} - \frac{\mathrm{scale}}{Z^2} \;\sim\; \texttt{levy\_l}(\mathrm{loc},\mathrm{scale}).$$
6.2 Expectation and variance diverge#
For large negative \(x\) (equivalently large \(y=\mathrm{loc}-x\)), the exponential term in the PDF is essentially 1, so

$$f(x) \approx \sqrt{\frac{\mathrm{scale}}{2\pi}}\,y^{-3/2}.$$

Then the mean integral behaves like

$$\int^{\infty} y\cdot y^{-3/2}\,dy = \int^{\infty} y^{-1/2}\,dy,$$

which diverges. Similarly, the second moment behaves like \(\int^{\infty} y^{1/2}\,dy\) and diverges as well.
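A quick empirical illustration: the sample median stabilizes near the true median, while the sample mean is dragged far into the left tail:

```python
import numpy as np
from scipy.stats import levy_l

rng = np.random.default_rng(2)
x = levy_l.rvs(size=100_000, random_state=rng)  # standard form

med = np.median(x)
m = np.mean(x)
print(med, levy_l.ppf(0.5))  # sample median is close to ppf(0.5) ~ -2.198
print(m)                     # sample mean is far below, dominated by the tail
```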
6.3 Likelihood (i.i.d. sample) and profile MLE#
Given observations \(x_1,\dots,x_n\) and parameters with \(\mathrm{loc} > \max_i x_i\) and \(\mathrm{scale}>0\), define \(y_i=\mathrm{loc}-x_i>0\). The log-likelihood is

$$\ell(\mathrm{loc},\mathrm{scale}) = \frac{n}{2}\ln(\mathrm{scale}) - \frac{n}{2}\ln(2\pi) - \frac{3}{2}\sum_{i=1}^n \ln y_i - \frac{\mathrm{scale}}{2}\sum_{i=1}^n \frac{1}{y_i}.$$

If loc is held fixed, maximizing over scale has a closed form:

$$\widehat{\mathrm{scale}}(\mathrm{loc}) = \frac{n}{\sum_{i=1}^n 1/y_i}.$$

Plugging this into \(\ell\) gives a 1D profile log-likelihood in loc, which can be optimized numerically.
def scale_mle_given_loc(loc: float, x: np.ndarray) -> float:
x = np.asarray(x, dtype=float)
y = loc - x
if np.any(y <= 0):
return np.nan
return len(x) / np.sum(1.0 / y)
def levy_l_loglik(loc: float, scale: float, x: np.ndarray) -> float:
x = np.asarray(x, dtype=float)
y = loc - x
if scale <= 0 or np.any(y <= 0):
return -np.inf
n = len(x)
return (
0.5 * n * np.log(scale)
- 1.5 * np.sum(np.log(y))
- 0.5 * scale * np.sum(1.0 / y)
- 0.5 * n * np.log(2 * np.pi)
)
def profile_loglik(loc: float, x: np.ndarray) -> float:
s_hat = scale_mle_given_loc(loc, x)
if not np.isfinite(s_hat):
return -np.inf
return levy_l_loglik(loc, s_hat, x)
# Demonstration: profile MLE vs SciPy's fit
loc_true, scale_true = -1.0, 2.0
x = levy_l_dist.rvs(loc=loc_true, scale=scale_true, size=3000, random_state=rng)
loc_fit, scale_fit = levy_l_dist.fit(x)
max_x = np.max(x)
eps = 1e-12
lower = max_x + eps
gap = float(np.median(max_x - x))
upper = max_x + 50.0 * max(gap, 1e-3)
res = optimize.minimize_scalar(lambda mu: -profile_loglik(mu, x), bounds=(lower, upper), method="bounded")
loc_mle = float(res.x)
scale_mle = float(scale_mle_given_loc(loc_mle, x))
{
"true": (loc_true, scale_true),
"scipy_fit": (loc_fit, scale_fit),
"profile_mle": (loc_mle, scale_mle),
"optimizer_success": bool(res.success),
}
{'true': (-1.0, 2.0),
'scipy_fit': (-0.9859227840906253, 2.004317043822053),
'profile_mle': (-0.9859283784918061, 2.0043267837576417),
'optimizer_success': True}
7) Sampling & Simulation#
NumPy-only sampling algorithm#
Using the normal transformation:
1. Sample \(Z\sim\mathcal{N}(0,1)\).
2. Return \(X = \mathrm{loc} - \mathrm{scale}/Z^2\).
This produces an exact sample from levy_l(loc, scale).
Because \(Z\) can be arbitrarily close to 0, this algorithm sometimes produces enormous negative values. That is expected (it is the heavy tail).
def levy_l_rvs_numpy(
loc: float = 0.0,
scale: float = 1.0,
size=1,
*,
rng: np.random.Generator | None = None,
) -> np.ndarray:
"""Sample from levy_l(loc, scale) using NumPy only."""
if rng is None:
rng = np.random.default_rng()
if not np.isfinite(loc):
raise ValueError("loc must be finite")
if not np.isfinite(scale) or scale <= 0:
raise ValueError("scale must be positive and finite")
z = rng.standard_normal(size)
# Exact zeros are extremely rare but would cause division by zero.
if np.ndim(z) == 0:
while z == 0:
z = rng.standard_normal()
return loc - scale / (z * z)
while True:
mask = z == 0
if not np.any(mask):
break
z[mask] = rng.standard_normal(np.sum(mask))
return loc - scale / (z * z)
# Quick check: quantiles match SciPy.
loc, scale = 0.0, 1.0
samples = levy_l_rvs_numpy(loc=loc, scale=scale, size=200_000, rng=rng)
q = np.array([0.1, 0.5, 0.9])
np.quantile(samples, q), levy_l_dist.ppf(q, loc=loc, scale=scale)
(array([-62.818698, -2.192106, -0.367231]),
array([-63.328118, -2.198109, -0.369612]))
8) Visualization#
Because levy_l is extremely heavy-tailed, plots are often most informative on a truncated quantile range (e.g., showing only the upper 60% of the distribution).
loc, scale = 0.0, 1.0
x = np.linspace(
levy_l_dist.ppf(0.4, loc=loc, scale=scale),
levy_l_dist.ppf(0.999, loc=loc, scale=scale),
900,
)
fig = make_subplots(rows=1, cols=2, subplot_titles=("PDF (truncated)", "CDF (truncated)"))
fig.add_trace(
go.Scatter(x=x, y=levy_l_dist.pdf(x, loc=loc, scale=scale), name="SciPy pdf"),
row=1,
col=1,
)
fig.add_trace(
go.Scatter(
x=x,
y=levy_l_pdf(x, loc=loc, scale=scale),
name="Formula pdf",
line=dict(dash="dash"),
),
row=1,
col=1,
)
fig.add_trace(
go.Scatter(x=x, y=levy_l_dist.cdf(x, loc=loc, scale=scale), name="SciPy cdf"),
row=1,
col=2,
)
fig.add_trace(
go.Scatter(
x=x,
y=levy_l_cdf(x, loc=loc, scale=scale),
name="Formula cdf",
line=dict(dash="dash"),
),
row=1,
col=2,
)
fig.update_xaxes(title_text="x", row=1, col=1)
fig.update_xaxes(title_text="x", row=1, col=2)
fig.update_yaxes(title_text="density", row=1, col=1)
fig.update_yaxes(title_text="cdf", row=1, col=2)
fig.update_layout(legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1))
fig.show()
# Monte Carlo samples (NumPy-only) with a truncated histogram for visualization.
n = 300_000
samples = levy_l_rvs_numpy(loc=loc, scale=scale, size=n, rng=rng)
lo, hi = x[0], x[-1]
samples_trunc = samples[(samples >= lo) & (samples <= hi)]
fig = go.Figure()
fig.add_trace(
go.Histogram(
x=samples_trunc,
nbinsx=80,
histnorm="probability density",
name=f"MC histogram (n={n:,}, truncated)",
opacity=0.5,
)
)
fig.add_trace(go.Scatter(x=x, y=levy_l_dist.pdf(x, loc=loc, scale=scale), name="SciPy pdf"))
fig.update_layout(title="Monte Carlo samples (tail truncated for plotting)", xaxis_title="x", yaxis_title="density")
fig.show()
9) SciPy Integration#
scipy.stats.levy_l provides the usual distribution API:
- `levy_l.pdf(x, loc=0, scale=1)`
- `levy_l.cdf(x, loc=0, scale=1)`
- `levy_l.rvs(loc=0, scale=1, size=..., random_state=...)`
- `levy_l.fit(data, ...)` (MLE)
A common pattern is to freeze the distribution: `rv = levy_l(loc=..., scale=...)`, then call `rv.pdf`, `rv.cdf`, `rv.rvs`, etc.
loc_true, scale_true = -1.0, 2.0
x = levy_l_dist.rvs(loc=loc_true, scale=scale_true, size=5000, random_state=rng)
# Fit both parameters.
loc_hat, scale_hat = levy_l_dist.fit(x)
# Fit with loc fixed (useful when loc is known from the problem).
loc_hat_fixed, scale_hat_fixed = levy_l_dist.fit(x, floc=loc_true)
# Example evaluations
rv = levy_l_dist(loc=loc_hat, scale=scale_hat)
x0 = np.array([loc_hat - 0.1, loc_hat - 1.0, loc_hat - 10.0])
{
"true": (loc_true, scale_true),
"fit": (loc_hat, scale_hat),
"fit_floc": (loc_hat_fixed, scale_hat_fixed),
"pdf(x0)": rv.pdf(x0),
"cdf(x0)": rv.cdf(x0),
}
{'true': (-1.0, 2.0),
'fit': (-1.0126691262593843, 1.9297875350509812),
'fit_floc': (-1.0, 1.9673828125000024),
'pdf(x0)': array([0.00113 , 0.211162, 0.015913]),
'cdf(x0)': array([0.999989, 0.835218, 0.339551])}
10) Statistical Use Cases#
Hypothesis testing (goodness-of-fit)#
If you have specified parameters (not estimated from the same sample), you can use a goodness-of-fit test such as Kolmogorov–Smirnov (KS).
Caveat: if you estimate parameters from the data and then run KS on the same data, the usual p-values are no longer exact (you need a corrected procedure or a bootstrap).
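A minimal parametric-bootstrap sketch of that corrected procedure (B=200 replicates for speed; use more in practice):

```python
import numpy as np
from scipy.stats import kstest, levy_l

rng = np.random.default_rng(3)
x = levy_l.rvs(loc=0.0, scale=1.0, size=300, random_state=rng)

# KS statistic with parameters estimated from the same data ...
loc_hat, scale_hat = levy_l.fit(x)
stat_obs = kstest(x, "levy_l", args=(loc_hat, scale_hat)).statistic

# ... calibrated by a parametric bootstrap: simulate from the fitted model,
# refit on each replicate, and compare the refitted KS statistics.
B = 200
stats_boot = np.empty(B)
for b in range(B):
    xb = levy_l.rvs(loc=loc_hat, scale=scale_hat, size=len(x), random_state=rng)
    loc_b, scale_b = levy_l.fit(xb)
    stats_boot[b] = kstest(xb, "levy_l", args=(loc_b, scale_b)).statistic

pvalue_boot = np.mean(stats_boot >= stat_obs)
print(stat_obs, pvalue_boot)
```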
Bayesian modeling#
If loc is known, the likelihood for scale has a convenient form. With \(y_i=\mathrm{loc}-x_i>0\):

$$L(\mathrm{scale}) \propto \mathrm{scale}^{n/2}\exp\!\left(-\frac{\mathrm{scale}}{2}\sum_{i=1}^n \frac{1}{y_i}\right).$$

So a Gamma prior on scale is conjugate.
Generative modeling#
Because the distribution is supported on \((-\infty,\mathrm{loc})\), it naturally models one-sided negative shocks. Summing i.i.d. draws produces a process with rare, very large downward jumps.
# Hypothesis testing example: KS test when parameters are known.
from scipy.stats import kstest
loc_true, scale_true = 0.0, 1.5
x = levy_l_dist.rvs(loc=loc_true, scale=scale_true, size=1500, random_state=rng)
stat, pvalue = kstest(x, "levy_l", args=(loc_true, scale_true))
stat, pvalue
(0.03198370647931559, 0.0909550922196165)
# Bayesian modeling example: conjugate Gamma posterior for scale when loc is known.
from scipy.stats import gamma
loc = 0.0
scale_true = 2.0
x = levy_l_dist.rvs(loc=loc, scale=scale_true, size=300, random_state=rng)
y = loc - x
# Prior: scale ~ Gamma(alpha0, rate=beta0)
alpha0, beta0 = 2.0, 1.0
alpha_post = alpha0 + len(x) / 2.0
beta_post = beta0 + 0.5 * np.sum(1.0 / y)
# SciPy's gamma uses a 'scale' parameter = 1/rate.
prior = gamma(a=alpha0, scale=1.0 / beta0)
post = gamma(a=alpha_post, scale=1.0 / beta_post)
post_mean = post.mean()
post_ci = post.ppf([0.05, 0.95])
grid = np.linspace(post.ppf(0.001), post.ppf(0.999), 600)
fig = go.Figure()
fig.add_trace(go.Scatter(x=grid, y=prior.pdf(grid), name="prior", line=dict(dash="dot")))
fig.add_trace(go.Scatter(x=grid, y=post.pdf(grid), name="posterior"))
fig.add_vline(x=scale_true, line_dash="dash", line_color="black", annotation_text="true scale")
fig.update_layout(
title=f"Posterior for scale (loc known). Posterior mean={post_mean:.3f}, 90% CI=[{post_ci[0]:.3f}, {post_ci[1]:.3f}]",
xaxis_title="scale",
yaxis_title="density",
)
fig.show()
# Generative modeling example: a process with one-sided heavy-tailed negative shocks.
n_steps = 200
shock_scale = 0.15
shocks = levy_l_rvs_numpy(loc=0.0, scale=shock_scale, size=n_steps, rng=rng)
path = np.cumsum(shocks)
fig = make_subplots(rows=1, cols=2, subplot_titles=("Shocks", "Cumulative sum"))
fig.add_trace(go.Scatter(y=shocks, mode="lines+markers", name="shocks"), row=1, col=1)
fig.add_trace(go.Scatter(y=path, mode="lines+markers", name="path"), row=1, col=2)
fig.update_xaxes(title_text="time", row=1, col=1)
fig.update_xaxes(title_text="time", row=1, col=2)
fig.update_yaxes(title_text="value", row=1, col=1)
fig.update_yaxes(title_text="value", row=1, col=2)
fig.update_layout(showlegend=False)
fig.show()
11) Pitfalls#
- Invalid parameters: `scale` must be strictly positive; the density is defined only for \(x<\mathrm{loc}\).
- Infinite moments: sample means/variances are unstable and can be dominated by rare extreme values.
- Visualization requires care: a few samples can be extremely negative; use truncation or log-scaled tail plots.
- Numerical issues:
  - For \(x\) extremely close to `loc`, the PDF involves an \(\exp(-\mathrm{scale}/(2(\mathrm{loc}-x)))\) term that can underflow; use `logpdf` when possible.
  - `ppf(q)` for very small \(q\) produces extremely large magnitudes; avoid evaluating at \(q\approx 0\) in finite precision.
- Fitting: MLE is sensitive to tail observations; consider fixing `loc` when known, or using robust / Bayesian approaches.
12) Summary#
- `levy_l` is a left-skewed, one-sided stable distribution with support \((-\infty,\mathrm{loc})\).
- It is the mirror of `levy`: \(\mathrm{loc}-X\) is (right) Lévy and can be seen as an inverse-gamma with shape \(1/2\).
- Mean/variance (and higher raw moments) diverge; quantiles and tail probabilities are the right tools.
- Exact NumPy-only sampling is easy via \(X=\mathrm{loc}-\mathrm{scale}/Z^2\) with \(Z\sim\mathcal{N}(0,1)\).
- SciPy provides evaluation, simulation, and MLE fitting through `scipy.stats.levy_l`.